Integrating transcription factor binding site information with gene expression datasets
Abstract
MOTIVATION: Microarrays are widely used to measure gene expression differences between sets of biological samples. Many of these differences will be due to differences in the activities of transcription factors. In principle, these differences can be detected by associating motifs in promoters with differences in gene expression levels between the groups. In practice, this is hard to do.
RESULTS: We combine correspondence analysis, between-group analysis and co-inertia analysis to determine which motifs, from a database of promoter motifs, are strongly associated with differences in gene expression levels. Given a database of motifs and gene expression levels from a set of arrays, the method produces a ranked list of motifs associated with any specified split in the arrays. We give an example using the Gene Atlas compendium of gene expression levels for human tissues where we search for motifs that are associated with expression in central nervous system (CNS) or muscle tissues. Most of the motifs that we find are known from previous work to be strongly associated with expression in CNS or muscle. We give a second example using a published prostate cancer dataset where we can simply and clearly find which transcriptional pathways are associated with differences between benign and metastatic samples.
AVAILABILITY: The source code is freely available upon request from the authors.
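The idea of turning a motif table, an expression table and a split of the arrays into a ranked motif list can be illustrated with a much simpler stand-in. The sketch below is not the authors' method: it replaces the combination of correspondence analysis, between-group analysis and co-inertia analysis with a plain covariance between each motif's promoter counts and the per-gene difference in mean expression between the two groups. The function name `rank_motifs`, the argument layout and the toy data are all assumptions made for illustration.

```python
import numpy as np

def rank_motifs(expr, motif_counts, groups, motif_names):
    """Rank motifs by covariance with the between-group expression difference.

    expr:         (genes x arrays) expression matrix
    motif_counts: (genes x motifs) motif occurrence counts in promoters
    groups:       boolean vector over arrays, True for the first group
    motif_names:  one label per motif column
    """
    expr = np.asarray(expr, dtype=float)
    motif_counts = np.asarray(motif_counts, dtype=float)
    groups = np.asarray(groups, dtype=bool)

    # Per-gene difference between the two group means: a single axis
    # that separates the specified split of the arrays.
    diff = expr[:, groups].mean(axis=1) - expr[:, ~groups].mean(axis=1)
    diff = diff - diff.mean()

    # Centre each motif column, then take its sample covariance with diff.
    centered = motif_counts - motif_counts.mean(axis=0)
    scores = centered.T @ diff / (len(diff) - 1)

    # Strongest associations first, regardless of sign.
    order = np.argsort(-np.abs(scores))
    return [(motif_names[i], float(scores[i])) for i in order]

# Hypothetical toy data: 4 genes, 4 arrays (first two arrays form group one).
expr = [[5, 5, 1, 1],
        [5, 5, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]
motif_counts = [[1, 0],
                [1, 1],
                [0, 0],
                [0, 1]]
ranked = rank_motifs(expr, motif_counts, [True, True, False, False],
                     ["motif_A", "motif_B"])
for name, score in ranked:
    print(name, round(score, 3))
```

A positive score means the motif is more common in promoters of genes that are higher in the first group; a negative score means the opposite. Unlike the ordination-based approach described in the abstract, this simplification treats each motif independently and ignores the joint structure of the two tables.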
Related articles
Homocysteine Induces Heme Oxygenase-1 Expression via Transcription Factor Nrf2 Activation in HepG2 Cells
Background: Elevated level of plasma homocysteine has been related to various diseases. Patients with hyperhomocysteinemia can develop hepatic steatosis and fibrosis. We hypothesized that oxidative stress induced by homocysteine might play an important role in pathogenesis of liver injury. Also, the cellular response designed to combat oxidative stress is primarily controlled by the transcripti...
EPConDB: a web resource for gene expression related to pancreatic development, beta-cell function and diabetes
EPConDB (http://www.cbil.upenn.edu/EPConDB) is a public web site that supports research in diabetes, pancreatic development and beta-cell function by providing information about genes expressed in cells of the pancreas. EPConDB displays expression profiles for individual genes and information about transcripts, promoter elements and transcription factor binding sites. Gene expression results ar...
Inferring condition-specific transcription factor function from DNA binding and gene expression data
Numerous genomic and proteomic datasets are permitting the elucidation of transcriptional regulatory networks in the yeast Saccharomyces cerevisiae. However, predicting the condition dependence of regulatory network interactions has been challenging, because most protein-DNA interactions identified in vivo are from assays performed in one or a few cellular states. Here, we present a novel metho...
Discovering transcriptional modules by Bayesian data integration
MOTIVATION We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expec...
Mapping of Transcription Factor Binding Region of Kappa Casein (CSN3) Gene in Iranian Bacterianus and Dromedaries Camels
κ-casein is a glycosylated protein in mammalian milk that plays an essential role in the milk micelles. Control of κ-casein expression reflects this essential role, although an understanding of the mechanisms involved lags behind that of the other milk protein genes. Transcriptional regulation, a first mechanism for controlling the development of organisms, is carried out by transcription facto...
Journal:
- Bioinformatics
Volume 23, Issue 3
Pages: -
Publication date: 2007